(57) Abstract



In a multichip integrated circuit module (4), the number of effective input/output pins (3, 5, respectively) is increased by
using techniques of TDM (time division multiplexing). A first chip (1) has at least one output shift register (9). A second chip (1)
has at least one input shift register (7). Interconnection wires (19) couple the output shift registers (9) and the input shift registers
(7). Means (15) are provided for loading data in parallel to the output shift registers (9). Means (17) are provided for sequentially
shifting data through the output shift registers (9) over the interconnections (19) and into the input shift registers (7). Embodiments
of the invention are described for use in conjunction with bi-directional pins (25), tri-state output drivers (21), and asynchronous
logic (18).

Description

MULTICHIP IC DESIGN USING TDM



Field of the Invention 

This invention pertains to the field of designing
circuits having a plurality of integrated circuit (i.c.)
chips, said design using techniques of TDM (time division
multiplexing).

Description of Background Art

Dobbelaere et al., "Field Programmable MCM Systems -
Design of an Interconnection Frame", IEEE 1992 Custom
Integrated Circuits Conference, pp. 4.6.1-4.6.4,
addresses the same problem addressed by the present
invention (increasing the number of effective
input/output pins in a multichip i.c. module) by
different techniques. The reference teaches programming
a matrix at the corner of each chip to determine the
interconnects among the inputs and outputs. The chips
are FPGA's (field programmable gate arrays). The
reference does not disclose the use of shift registers or
TDM.

"Programmable Interconnect Architecture", Aptix
Corporation Technology Backgrounder, Nov. 1991, pp. 1-14,
and an Aptix press release dated January 1, 1992,
describe use of an areal grid to preserve the ability to
change the interconnects among a set of integrated
circuit chips in applications where logic is being
implemented onto a set of multiple i.c.'s. These
references do not disclose techniques of TDM. Shift
registers are used, but only to determine which i.c.'s
get connected to each other. The present invention uses
shift registers to convey signal information from chip to
chip.

U.S. patent 5,036,473 uses dedicated FPGA's solely
to determine what active FPGA's get connected to which
others. The reference discloses the use of software to
drive and observe signals, but does not disclose the use
of shift registers or TDM.

U.S. patent 5,109,353 discloses an array of
programmable gate elements for emulating electronic
circuits and systems. It does not disclose the use of
shift registers or TDM.

"Handbook of Hardware Modeling", Logic Modeling
Systems Incorporated, February 1992, describes the use of
software to drive and observe signals in a multichip
module. Techniques disclosed in this reference can be
used in conjunction with the invention described herein.

"Computer-Aided Prototyping", Quickturn Systems,
Inc., 1991, pp. 1-4, discloses partitioning logic for a
set of interconnected FPGA's and providing software
stimulus to capture and observe the response. There is
no disclosure of shift registers or TDM. This technology
is further described in "FastForward", LSI Logic,
September 1991, and "MARS Product Overview", PiE Design
Systems, Inc. Similar technology is described in three
press releases by InCA Integrated Circuit Applications:
"InCA Virtual ASIC Emulation System supports Xilinx
4000 family FPGAs", June 8, 1992; "Virtual ASIC,
Automatic ASIC Emulation from InCA", 1991; and "Concept
Silicon partitions your design onto multiple FPGAs".

Disclosure of Invention

The present invention is a multichip integrated
circuit module (4) comprising at least two integrated
circuit chips (1). The first chip (1) has at least one
output shift register (9). The second chip (1) has at
least one input shift register (7). Interconnections
(19) couple the output shift register(s) (9) and the input
shift register(s) (7). Means (15) are provided for
loading data in parallel to the output shift register(s)
(9). Means (17) are provided for sequentially shifting
data through the output shift register(s) (9) over the
interconnections (19) and into the input shift
register(s) (7).

Brief Description of the Drawings 
These and other more detailed and specific objects
and features of the present invention are more fully
disclosed in the following specification, reference being
had to the accompanying drawings, in which:

Figure 1 is a sketch of an embodiment of the present
invention in which an integrated circuit chip 1 uses at
least one input shift register 7 and one output shift
register 9.

Figure 2 is a sketch showing how the techniques of
the present invention reduce the number of
interconnection wires 19 among chips 1.

Figure 3 is a sketch of an embodiment of the present
invention in which the number of stages 24 in an input
shift register 7 and output shift register 9 can be
reduced by one.

Figure 4 is a sketch of a chip 1 using tri-state
output drivers 21.

Figure 5 is a sketch of an embodiment of the present
invention in which two output shift registers 9 are used
in conjunction with a single output driver 21.

Figure 6 is a sketch showing a plurality of
tri-state output drivers 21, each having a common gating
signal 23.

Figure 7 is a sketch showing how a single output
shift register 9 can be used when the output drivers 21
have a common gating signal 23.

Figure 8 is a sketch showing the use of
bi-directional pins and a plurality of tri-state output
drivers 21 having a common gating signal 23.

Figure 9 is a sketch showing how a single
bi-directional pin 25 can be used when a plurality of
tri-state output drivers 21 have a common gating signal
23.

Figure 10 shows a chip 1 having a plurality of
bi-directional pins 25, each coupled to an output driver
21 that has a different gating signal.

Figure 11 shows a single pin 25 equivalent to the
embodiment depicted in Figure 10.

Figure 12 shows an embodiment of the present
invention in which asynchronous logic 18 is employed.

Figure 13 shows an embodiment of the present
invention in which a plurality of output shift registers
9 are daisy chained together.

Figure 14 shows an embodiment of the present
invention in which a plurality of test shift registers 12
are multiplexed together.

Figure 15 shows an embodiment of the present
invention in which the TDM sequence must be applied twice.

Figure 16 shows an embodiment of the present
invention in which an interconnect 19 couples two chips 1.

Figure 17 shows how the interconnect 19 of Figure 16
can be reconfigured by reprogramming the first chip 1(1).

Figure 18 shows a single interconnect 19 coupling
four chips 1.

Figure 19 shows how the interconnect 19 of Figure 18
can be reprogrammed by reprogramming the source chip 1(1).

Figure 20 shows an embodiment of the present
invention in which an interconnect wire 19 couples two
chips 1.

Figure 21 shows how the interconnect 19 of Figure 20
can be reprogrammed by adding a new signal S3 to the
source chip 1(1).

Figure 22 shows an embodiment of the present
invention in which the number of stages 24 in an input
shift register 7 can be reduced by one.

Detailed Description of the Preferred Embodiments 

A major purpose of this invention is to increase the
effective number of input/output (I/O) pins 3,5 on
integrated circuit chips 1 within a module 4 that
comprises a plurality of said chips 1. The invention can
be thought of as creating a number of virtual I/O pins
3,5 that is greater than the number of actual pins 3,5.
The invention also eases pressures on the system designer
when the number of interconnect wires 19 on the module 4
is limited. Further, the invention enhances the
reconfigurability of signal flow across chips 1.

An example where the invention is useful is in the
design of a system 4 in which there is a need to fit a
given amount of logic, which may be provided in the form
of a netlist, into one or more of the chips 1. Normally,
said chips 1 come with a fixed number of I/O pins 3,5
that are used for interconnections 19 among the chips 1.
The limited number of I/O pins 3,5 can greatly reduce the
efficiency of utilizing the chips 1, and in some cases
make the task of fitting the logic into the chips 1
extremely difficult.
The module 4 can be a production module, in which
the chips 1 are executable and application-ready.
Alternatively, module 4 can be a prototype module, in
which the chips 1 are experimented with by the designer
to create a system design. Changes in such a prototyping
environment are typically done by a combination of
hardware and software changes. Alternatively, module 4
can contain some production chips 1 and some programmable
chips 1.

Chips 1 are any chips for which the user has control
over the contents, such as FPGA's (field programmable
gate arrays), non-field-programmable gate arrays, custom
i.c.'s, semi-custom i.c.'s (application specific
integrated circuits), and standard cell i.c.'s. FPGA's
are normally preferable, because of their flexibility.
The present invention makes use of techniques of TDM
(time division multiplexing or time domain
multiplexing). Execution of the system embodied in
module 4 takes longer as a result, but in many
applications this is of no concern. The TDM process is
transparent to the operation of the logic embodied in
module 4.

Typically the TDM is implemented by shift registers
7,9, as illustrated in Figure 1. Each input pin 3 and
each output pin 5 on a chip 1 is assigned a dedicated
shift register, 7,9, respectively. The number of stages
24 in a shift register 7,9 is referred to as N. N is any
positive integer greater than or equal to 2, and could be
typically between 2 and 5. Preferably, N is the same for
all shift registers 7,9 on a chip 1. If N were not the
same, a separate shift clock 17 would be needed for each
different value of N. In the chip 1 illustrated in
Figure 1, the value of N is the same, and therefore there
is but one shift clock 17.

Each chip 1 can have a plurality of input pins 3 and
a different number of output pins 5. The different chips
1 on a module 4 can have different numbers of input pins
3 and output pins 5.
The individual stages 24 of the shift registers 7,9
can be, for example, flip-flops, edge triggered latches,
or pairs of polarity hold latches. When polarity hold
latches are used, a pair of shift clocks 17 is required
for use with the corresponding shift register 7,9.

The shift register 9 attached to an output pin 5 is
referred to as an output shift register (OSR) 9, and
functions as N virtual output pins 5. N internal signals
from within the chip 1 can be loaded into the OSR 9 in
parallel by means of activating a parallel load clock 15
associated with the OSR 9. These signals are then
serially shifted out of the OSR 9 over the corresponding
output pin 5 using the shift clock 17 associated with
said OSR 9. The signals travel over interconnection
wires 19 to other chips 1 that need to receive the
signals.

Similarly, the shift register 7 attached to an input
pin 3 is referred to as an input shift register (ISR) 7,
and functions as N virtual input pins 3. It can receive
serially N signals from a board interconnect 19 through
the associated input pin 3 by means of the shift clock 17
that is connected to said ISR 7. The received signals
can then simultaneously be applied inside the chip 1
using their stored states within the stages of the ISR
7. No special clock is needed to unload the signals from
the ISR 7, because once these signals are in the ISR 7,
they are visible to the logic within the chip 1.
Preferably, a single parallel load clock 15 and a
single shift clock 17 are used for all the ISR's 7 and
OSR's 9 on the board 4.

Preferably, the OSR's 9, ISR's 7, and attendant
parallel load and shift clocks 15, 17 are customized
within peripheral regions of the gate arrays 1.
Alternatively, the OSR's 9 and ISR's 7 are fabricated
from logic normally present on the gate arrays 1.

Figure 2 illustrates how the invention minimizes the
number of interconnect wires 19 among chips 1 and
minimizes the number of I/O pins 3,5 within a chip 1.
Figure 2 illustrates the interconnections of OSR 9 within
a source chip 1 and three ISR's 7 residing within three
target (sink) chips 1. The target chips 1 could be
identical or different in a hardware sense. A single
interconnect wire 19 couples the chips 1. Three signals
S1, S2, and S3 from the source chip 1 are conveyed to the
three sink chips 1. If the TDM and shift register
technique were not used, three interconnects 19 would be
required. The interconnect wire 19 can be thought of as
a TDM bus which bundles in the time domain the three
signals S1, S2 and S3. Said TDM bus 19 is visible to all
three sink chips 1.

As illustrated in Figure 2, not all of the three 
signals need to be used in each sink chip 1. (The 
signals within the sink chips 1 are primed for notational 
purposes.) 
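
A rough software analogy of the single-wire TDM bus of Figure 2
is given below. It is a sketch under stated assumptions: the
function name, the sink chip names, and the choice of which sink
uses which signals are hypothetical, since the text does not say
which primed signals each sink chip consumes.

```python
def broadcast(source_signals, sinks):
    """Send the source OSR contents over one shared wire to every sink ISR.

    source_signals: dict of signal name -> bit, in OSR stage order.
    sinks: dict of sink chip name -> list of signal names it actually uses.
    Returns, per sink chip, only the signals that chip uses.
    """
    names = list(source_signals)
    n = len(names)
    osr = [source_signals[name] for name in names]
    isrs = {chip: [0] * n for chip in sinks}
    for _ in range(n):
        bit = osr[-1]                 # value currently on the shared wire
        osr = [0] + osr[:-1]
        for chip in isrs:             # every sink sees the same wire
            isrs[chip] = [bit] + isrs[chip][:-1]
    return {
        chip: {name: isrs[chip][names.index(name)] for name in used}
        for chip, used in sinks.items()
    }


# Hypothetical usage pattern: three sinks, each using a different subset.
result = broadcast(
    {"S1": 1, "S2": 0, "S3": 0},
    {"sink_a": ["S1", "S2"], "sink_b": ["S3"], "sink_c": ["S1", "S2", "S3"]},
)
print(result)
```

In this model every sink ISR receives all three signals, and a
sink simply ignores the stages it does not need, so one wire
replaces the three that a non-multiplexed design would use.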

The TDM process described herein is transparent to
the intended logic design. This transparency can be
achieved in different ways, depending upon the degree of
transparency needed. For example, if all chips 1 on the
module 4 are synchronously clocked (which is preferable),
the following two-step TDM process can be used, at a safe
time after the application of each pulse from the system
clock (not illustrated). The safe time is that amount of
time needed for all of the signals in the logic to
achieve a steady state, i.e., when the intended design
has been programmed into the chips 1.

Step 1. The parallel load clock 15 (which is
preferably a single clock applied to all output shift
registers 9 on the board 4) is applied to effect a
parallel capture of signals into all of the OSR's 9 on
all the chips 1.

Step 2. Using the shift clock 17 (preferably a
single clock used by all the ISR's 7 and OSR's 9 on all
the chips 1), the contents of all of the OSR's 9 are
shifted into the target ISR's 7 via the interconnects
19. This shifting process involves N applications of
shift clock 17 to all of the chips 1.
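
The two steps above can be sketched at board level as follows.
This is a minimal illustration, not the patented circuit: the
function name `tdm_cycle`, the wire names, and the dictionary
representation are assumptions introduced here. It counts clock
applications to show that one TDM sequence costs one parallel
load pulse plus N shift pulses, regardless of how many wires
are multiplexed.

```python
def tdm_cycle(osr_inputs, n):
    """Run one TDM sequence for every wire on a (hypothetical) board.

    osr_inputs: dict of wire name -> list of N bits presented to that OSR.
    Returns (isr_contents, pulses), where pulses counts clock applications.
    """
    pulses = {"load": 0, "shift": 0}

    # Step 1: a single parallel load pulse captures signals into all OSRs.
    osrs = {}
    for wire, signals in osr_inputs.items():
        if len(signals) != n:
            raise ValueError("every OSR must have exactly N stages")
        osrs[wire] = list(signals)
    pulses["load"] += 1

    # Step 2: N shared shift pulses; each pulse moves one bit on every wire.
    isrs = {wire: [0] * n for wire in osrs}
    for _ in range(n):
        for wire in osrs:
            bit = osrs[wire][-1]
            osrs[wire] = [0] + osrs[wire][:-1]
            isrs[wire] = [bit] + isrs[wire][:-1]
        pulses["shift"] += 1
    return isrs, pulses


isrs, pulses = tdm_cycle({"w1": [1, 0, 0], "w2": [0, 1, 1]}, 3)
print(isrs, pulses)
```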

Exceptions to the above scheme exist for various
reasons, such as: (1) Certain I/O pins are
bi-directional pins 25 (see Figure 8). Such
bi-directional pins 25 could be left unmodified. Signals
involving such pins 25 are not affected by the above two
steps. (2) Inputs 3 that receive direct clock signals
(as opposed to data signals) do not require the use of an
ISR 7. Using an ISR 7 there would be undesirable, because
it would cause the clock to jiggle. Similarly, output
pins 5 that carry clock signals going to other chips 1 do
not require the use of OSR's 9. Clock signals cannot be
transferred across chips 1 using TDM without significant
impact to the intended logic. As such, it is not
desirable to use such techniques for I/O pins 3,5 that
carry clock signals.

Figure 12 shows that if the logic being prototyped
is asynchronous, an additional latch 29 is needed for
each ISR 7 stage 24 that drives a piece of asynchronous
logic 18. Such a latch 29 is a voltage level or logic
level sensitive latch that receives its data from an ISR
7 stage output and is clocked by yet another clock called
the P clock 27. P clock 27 is shared among all chips 1
that have asynchronous logic. The function of this added
latch 29 is to screen the shifting of the ISR 7 from the
asynchronous logic 18. P clock 27 is pulsed once after
the completion of the two usual TDM steps, described
earlier, that are used to effectuate the transfer of
signals across chip 1 boundaries using TDM. As shown in
Figure 12, some of the stages 24 of the ISR 7 can drive
synchronous logic 16, and others of the stages 24 can
drive asynchronous logic 18.
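
The screening role of latch 29 can be illustrated with a small
model. It is a sketch under simplifying assumptions: the class
name is hypothetical, every ISR stage is given a latch, and the
level-sensitive latch is approximated as taking a snapshot on
each P clock pulse rather than being transparent for the
duration of the pulse.

```python
class ScreenedInput:
    """Hypothetical ISR 7 whose stages feed asynchronous logic via latches 29."""

    def __init__(self, n):
        self.isr = [0] * n
        self.latched = [0] * n   # values visible to asynchronous logic 18

    def shift(self, bit):
        # Shift clock 17: ISR contents change, latch outputs do not.
        self.isr = [bit] + self.isr[:-1]

    def pulse_p_clock(self):
        # P clock 27: pulsed once after the TDM sequence has completed.
        self.latched = list(self.isr)


port = ScreenedInput(3)
seen_during_shift = []
for bit in (0, 1, 1):
    port.shift(bit)
    seen_during_shift.append(list(port.latched))
port.pulse_p_clock()
print(seen_during_shift, port.latched)
```

While the ISR is shifting, the latched values stay at their
previous state, so the asynchronous logic never observes the
intermediate bit patterns.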

As illustrated in Figures 13 and 14, multiple OSR's
9 can be connected into one daisy-chained composite "test
shift register" (TSR) 12 on each chip 1 to facilitate the
observation of the captured signals externally by
shifting out the TSR 12. All of the observation can be
performed at the output of pin 5(M). These observed
signals can be used for debugging the prototype hardware,
depending upon the availability of extra board 4 logic
and pins 3,5. The contents of the TSR 12 can then be
reloaded from the output 5(M) of the TSR 12 to the shift
register data input 3 on the same chip 1 if it is desired
to continue with the operation of the hardware. A
preferred way of accomplishing this reloading is to make
a connection on the chip 1 itself from output 5(M) to
input 3, thereby creating a circular shift register.
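
The circular shift register arrangement can be sketched as
follows. The function name is hypothetical and the TSR is
modelled as a flat list of bits, which is an assumption; the
point is only that shifting the full length of the TSR with the
output fed back to the input both exposes every bit and leaves
the contents as they were.

```python
def observe_and_restore(tsr):
    """Shift a TSR 12 out through its last stage while feeding the
    output back into its first stage (circular shift register).

    Returns (observed_bits, final_contents).
    """
    stages = list(tsr)
    observed = []
    for _ in range(len(stages)):
        bit = stages[-1]              # bit visible at output pin 5(M)
        observed.append(bit)
        stages = [bit] + stages[:-1]  # fed back to the shift data input
    return observed, stages


observed, restored = observe_and_restore([1, 0, 0, 1, 1])
print(observed, restored)
```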
The set of TSR's 12 can be multiplexed to be
observable at the output 10 of the board 4 as illustrated
in Figure 14. Multiplexer select signals 14 control the
selection of the individual TSR 12 outputs by multiplexer
8. The input to all the TSR's 12 is injected via board
input 6. The scheme depicted in Figure 14 is useful when
there are more chip output pins 5 that the designer wants
to observe than there are available board output pins 10.

Since the OSR's 9 are usable for debugging purposes,
the designer can choose to provide extra, initially
unused, stages 24 within some OSR's 9. These stages 24
are then available to be used for observing internal
signals that the hardware designer may not have thought
about earlier, for example, by changing the programming
on the chip 1 to look at these stages 24.

It is possible to reconfigure the interconnects 19
among the chips 1 by sending a different set of signals
on each interconnect 19 than was previously intended.
When combined with the internal reconfigurability of the
FPGA 1 itself, a powerful new type of reconfigurability
is thereby created. Figure 14 shows a module 4 with a
number of FPGA's 1 and set up for using TDM. To achieve
a reprogrammable board 4, one must be able to reprogram
both the logic on the chips 1 and the interconnects 19.
The on-chip logic is already reprogrammable through the
use of the FPGA's 1. To reprogram the interconnects 19,
an FPGA 1 can be used as a programmable interconnect chip
as in U.S. patent 5,036,473, cited above; or a
programmable interconnect chip as described in the Aptix
references cited above can be used. Alternatively,
several programmable interconnect chips 1 may be used. A
limited reprogrammability of interconnects is described
in Figures 16-21.

Figure 16 shows three signals, S1, S2, S3, TDM'ed
between chips 1(1) and 1(2). By reprogramming the first
FPGA 1(1), the manner in which the three signals S1, S2,
S3 connect can be changed, as shown in Figure 17. In
effect, the connections 19 between the two chips 1 have
been rewired.

Figure 18 shows four chips 1 with signals S1, S2, S3
connected as shown. By reprogramming the first chip
1(1), as shown in Figure 19, signal S3 crossing between
chip 1(1) and chip 1(4) is changed to S4.

In Figure 20, two interconnected chips 1 are shown.
Figure 21 shows how a signal can be added to the
interconnect 19 just by reprogramming the individual chip
1(1) and not touching the interconnect 19.

The flexibility described herein results from chip 1
being able to reprogram a number of signals that are
TDM'ed across to other chips 1. Traditional techniques
limit the reprogrammability to a single signal.

Once the value N is fixed for a given system 4, the
total number of signals that can be moved across the chip
1 boundaries is predetermined. The total number of
signals movable is N multiplied by the number of unique
interconnect wires 19 on the board 4. It is apparent
that by increasing N, a capacity greater than what is
initially needed for the total number of signals can be
achieved. The additional capability is usable for
greater flexibility in reconfiguring the interconnects 19.
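
The capacity relationship stated above is simple arithmetic and
can be checked with a short helper. The function names and the
example figures (N of 4, ten wires, 32 required signals) are
hypothetical and chosen only for illustration.

```python
def tdm_capacity(n_stages, wires):
    # Total signals movable across chip boundaries: N times the wire count.
    return n_stages * wires


def spare_capacity(n_stages, wires, signals_needed):
    # Capacity left over for later reconfiguration of the interconnects.
    return tdm_capacity(n_stages, wires) - signals_needed


total = tdm_capacity(4, 10)
spare = spare_capacity(4, 10, 32)
print(total, spare)
```

With these example numbers, raising N from 4 to 5 on the same
ten wires would add ten more signal slots without any change to
the board wiring.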

Theoretically, it is possible to accomplish all
interconnections between any two chips 1 by using two
interconnect wires 19 between them: one to carry signals
flowing in one direction, and the second to carry signals
flowing in the opposite direction. In this extreme
situation of maximum multiplexing, the shift registers
7,9 must have an N greater than or equal to the larger of
the number of the input signals and the number of the
output signals on any single chip 1. (One signal is
associated with each stage 24 of a shift register 7,9.)
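
The bound on N for this maximum-multiplexing case can be written
as a one-line calculation. The function name and the per-chip
signal counts below are hypothetical; the rule itself (N at
least the larger of the input and output signal counts on any
single chip) is taken from the paragraph above.

```python
def min_stages_for_two_wire_links(chips):
    """chips: dict of chip name -> (input signal count, output signal count).

    Returns the smallest N that satisfies the maximum-multiplexing bound.
    """
    return max(max(inputs, outputs) for inputs, outputs in chips.values())


n_min = min_stages_for_two_wire_links({"chip_a": (3, 7), "chip_b": (5, 2)})
print(n_min)
```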

The techniques of the present invention can be
selectively applied to areas of logic that are considered
susceptible to change or require use of TDM to
accommodate the number of signals that need to traverse
across chip 1 boundaries. As such, ISR's 7 and OSR's 9
need not be placed on all qualifying I/O pins 3,5.
Instead, if there are insufficient resources on the chip
1, the ISR's and OSR's 7,9 may be deployed only on enough
I/O pins 3,5 to allow adequate signal flow through the
chips 1. It should be noted that an interconnect 19 must
have either ISR's 7 and OSR's 9 on all terminals of that
interconnect 19, or else must have no ISR's 7 or OSR's 9
at all.

When the chips 1 are FPGA's, it is possible to make
engineering changes to prototype hardware 4 by
reprogramming the affected FPGA's 1 to (1) change the
logic realized by certain FPGA's 1 and/or (2) modify the
effective interconnects 19 by capturing different signals
into one or more of the OSR's 9 and/or (3) change the
signals that are made observable by capturing them into
unused stages 24 of OSR's 9.

Figure 22 shows an optimized version of the scheme
in Figure 2. Here, the contents of the OSR 9 can be made
observable, while reducing the number of stages 24 in an
ISR 7 by one. In this embodiment, the state of the last
stage 24 of an OSR 9 is used as a substitute for the
eliminated stage 24 of an ISR 7.

In certain prototyping situations, it may not be
necessary to observe the contents of the OSR's 9. In
these cases, the number of stages 24 needed in an ISR 7
and OSR 9 can be reduced by one, corresponding to the
stage 24 the system designer does not need to observe.
This reduction is achieved by using the scheme depicted
in Figure 3. Figure 3 shows how N equals 2 can provide
the capability to multiplex three signals across a single
interconnect 19. In this embodiment, the final state of
the interconnect 19 after application of the TDM sequence
is used as one of the multiplexed signals for the sink
chip 1(2). In contrast, the scheme depicted in Figure 2
does not have the sink chips 1 dependent upon the final
state of the interconnect 19, but only on the states of
the ISR's 7.

The scheme depicted in Figure 1 uses ISR's 7 and
OSR's 9 to multiplex signals across chip 1 boundaries.
Such a TDM sequence serves to transfer signals once
across an interconnect 19. However, if there is a logic
path that is not interrupted by a latch or other
sequential logic (other than ISR's 7 and OSR's 9) and
spans more than two FPGA chips 1, then the TDM sequence
needs to be repeated. The total number of TDM sequence
applications required is equal to the number of distinct
board level interconnect wires 19 that are included in
the logic path. Figure 15 shows a logic path of
combinational logic 22 that spans two board level
interconnect wires 19(1) and 19(2). There are no latches
other than ISR's 7 and OSR's 9 in the path. As such, the
TDM sequence needs to be applied twice to move the signal
through the three chips 1. Notice that ISR's 7 and OSR's
9 force the use of TDM. Without ISR's 7 and OSR's 9, the
signal from chip 1(1) to chip 1(2) to chip 1(3) would
flow without any clocking.
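
Why the TDM sequence must be repeated once per board level wire
can be seen in a small simulation of the three-chip path of
Figure 15. This is a sketch with several assumptions: each
register is reduced to a single bit, the combinational functions
`f1`, `f2`, and `f3` are hypothetical stand-ins for logic 22,
and all ISR's start at zero.

```python
def f1(x):
    return x          # hypothetical logic on chip 1(1)


def f2(x):
    return 1 - x      # hypothetical inverter on chip 1(2)


def f3(x):
    return x          # hypothetical logic on chip 1(3)


def run_tdm_sequences(x, count):
    """Apply `count` TDM sequences to the chain
    chip 1(1) -> wire 19(1) -> chip 1(2) -> wire 19(2) -> chip 1(3)."""
    isr2 = 0  # ISR on chip 1(2), fed by wire 19(1)
    isr3 = 0  # ISR on chip 1(3), fed by wire 19(2)
    for _ in range(count):
        # Parallel load: every OSR captures its input from the current state.
        osr1 = f1(x)
        osr2 = f2(isr2)
        # Shift: every OSR is moved into its target ISR at the same time.
        isr2, isr3 = osr1, osr2
    return f3(isr3)


expected = f3(f2(f1(1)))
after_one = run_tdm_sequences(1, 1)
after_two = run_tdm_sequences(1, 2)
print(expected, after_one, after_two)
```

In this model the first sequence captures a stale value from
chip 1(2), because its ISR had not yet been updated when the
parallel load occurred; only after the second sequence does the
output of chip 1(3) match the purely combinational result.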

The techniques of the present invention can be
extended to bi-directional pins 25 and/or three state
output drivers 21. (See Figure 4.) Such a driver 21
typically has outputs of logical zero, logical one, and
high impedance.

Buses 19 generally employ tri-state drivers 21 to
allow multiple source chips 1 onto a single wire. Such
buses 19 are intended by the designer to achieve
efficient interconnects between a multitude of source
chips 1 and some number of sink chips 1. Generally, one
source chip output driver 21 is active while others are
in a high impedance state, so that one source chip output
driver 21 is driving all receivers in the bus 19. The
architecture shown in Figure 1 cannot be employed
directly in such a case. Figure 4 shows N output pins 5,
with a tri-state driver 21 on each pin 5. The OSR 9
scheme of Figure 1 will not work here, because the high
impedance state cannot be transmitted via an OSR 9 and an
ISR 7. Instead, the scheme shown in Figure 5 needs to be
employed. It should be noted that this scheme uses two
OSR's 9 instead of one: the first OSR 9(1) to capture
the N gating signals and the second OSR 9(2) to capture
the N data signals. Output driver 21 sees the
corresponding gate-data combination as the OSR's 9 are
shifted out. Thus, output pin 5 sees the gated output
from output driver 21. The generated TDM sequence of
signals on output pin 5 combines with similarly generated
TDM sequences from other output pins 5 that are connected
with this first output pin 5.
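
The gate-plus-data arrangement of Figure 5 can be modelled per
time slot as shown below. The names, the use of `None` for the
high impedance state, and the rule that two simultaneously
active drivers raise an error are assumptions of this sketch;
the text only states that generally one driver is active at a
time.

```python
Z = None  # stands in for the high impedance state


def tristate_stream(gates, data):
    """Per-slot pin values produced by gate OSR 9(1) and data OSR 9(2).

    Both registers are shifted together, last stage first.
    """
    return [d if g else Z for g, d in zip(reversed(gates), reversed(data))]


def resolve_bus(streams):
    """Combine the TDM streams of all pins tied to the same wire."""
    resolved = []
    for slot in zip(*streams):
        driven = [value for value in slot if value is not Z]
        if len(driven) > 1:
            raise ValueError("bus contention: two drivers active in one slot")
        resolved.append(driven[0] if driven else Z)
    return resolved


chip_a = tristate_stream(gates=[1, 0, 1], data=[1, 1, 0])
chip_b = tristate_stream(gates=[0, 1, 0], data=[0, 1, 1])
bus = resolve_bus([chip_a, chip_b])
print(chip_a, chip_b, bus)
```

Each time slot on the wire thus carries the value of whichever
source was enabled for that slot, which is how the shared bus
behaviour survives multiplexing in this model.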

Figure 6 shows N output pins 5 with high impedance
drivers 21 that share a common output gating signal 23.
Signal 23 either produces a high impedance on the outputs
of drivers 21 or else transmits the input signals to the
outputs. Figure 7 shows a corresponding single pin
architecture that avoids the use of two OSR's 9 but still
places output driver 21 between the single OSR 9 and the
output pin 5. The use of two OSR's 9 is avoided because
of the presence of the common gating signal 23.

Figure 8 shows a typical N bi-directional pin 25
configuration with a common output gating signal 23. In
Figure 8, the same pin 25 is used both to output a signal
from the chip 1 and to receive signals from the
outside. To achieve a single pin 25 architecture for TDM
of the N signals, the scheme of Figure 9 is employed. In
Figure 9, the architecture of Figure 7 is used to feed
the outputs of the N drivers 21, and a single ISR 7 is
used for the N input signals as in the Figure 1
embodiment.

Figure 10 shows N bi-directional pins 25, each with
a different gating output signal. The single-pin 25
architecture equivalent for TDM of the N signals uses the
scheme shown in Figure 11. In Figure 11, the
architecture of Figure 5 is employed for the N output
drivers 21, and a single ISR 7 is applied to the N input
signals as in the Figure 1 embodiment.

Personalization instructions are injected into
module 4, e.g., by techniques described in U.S. patent
5,109,353 cited above. More than one logic design can be
introduced into the module 4 by different personalization
instructions. Software that is intended to provide
automatic or interactive partitioning of the intended
system design, such as Concept Silicon from InCA cited
above, can exploit the knowledge of the existence of
ISR's 7 and OSR's 9 to help pack more logic into each of
the programmable chips 1 and/or to minimize the number of
interconnection wires 19.

The software is executed on a computer external to
module 4, e.g., a workstation. The software can be
standalone (not physically coupled to module 4) software
that exploits the hardware architecture of the module 4
as introduced by the invention. Alternatively, there can
be physical coupling between the software and the module
4, e.g., the computer on which the software is executed
can have an electrical connection to the module 4, over
which the personalization created by the software is
downloaded into the module 4.

The benefit of minimizing the number of chips 1 is
obvious. The benefit of minimizing the number of
interconnects 19 is significant. If the prototype board
4 is intended to utilize Aptix-type programmable
interconnect chips, minimizing the number of
interconnects 19 will reduce the number of Aptix
components needed to achieve programmable interconnects.
Either the software is told the number of virtual I/O's
3,5 rather than the actual number; or else the software
is told all about the ISR's 7 and OSR's 9 so that it will
take this information into account when it does the
partitioning. The programmable interconnect chip does
not always need to be changed. Rather, ISR's 7 and/or
OSR's 9 are placed on the chips 1 to which the
programmable interconnect chip is connected, thereby
enhancing the programmable interconnect chip.

Combining the use of the present invention with the
use of one or more programmable interconnect chips can
allow the board 4 designer the flexibility of creating
additional interconnects 19 among chips 1. Programmable
interconnect chips are used to interconnect among two or
more I/O pins 3,5. These I/O pins 3,5 could also have
ISR's 7 and OSR's 9 on board the chip 1, thus providing a
greatly increased number of programmable interconnects.

The configuration of the prototype board 4 is
changed by a combination of hardware and software. For
hardware changes, interconnects 19 are reprogrammed,
FPGA's and/or programmable interconnect chips 1 are
reprogrammed, FPGA's 1 are added or subtracted, and
connections are obliterated using lasers. For software
changes, software other than the partitioning software is
used to, e.g., reprogram the Aptix chip(s) 1.

The above description is included to illustrate the
operation of the preferred embodiments and is not meant
to limit the scope of the invention. The scope of the
invention is to be limited only by the following claims.

From the above discussion, many variations will be
apparent to one skilled in the art that would yet be
encompassed by the spirit and scope of the invention.

What is claimed is:

Claims

1. A multichip i.e. module comprising: 

at least two integrated circuit chips, a first 
chip having at least one output shift register and a 
5 second chip having at least one input shift register; 

interconnections coupling the output shift 
register (s) and; the input shift register (s); 

means for loading signals in parallel to the 
output shift register (s); and 
10 means for sequentially shifting signals through 

the output shift register (s) over the interconnections 
and into the input shift register (s). 

2. The multichip i.c. module of claim 1 wherein
all chips are clocked synchronously.

3. The multichip i.c. module of claim 1 wherein at
least one chip contains asynchronous logic, said chip
comprising an input shift register that is coupled to
said asynchronous logic via a polarity hold latch;
wherein:

said polarity hold latch is clocked by a P
clock.

4. The multichip i.c. module of claim 1 wherein at
least some of the chips are gate arrays from the set
comprising FPGA's and non-field-programmable gate arrays.

5. The multichip i.c. module of claim 4 wherein
the output shift register(s), the input shift
register(s), and associated clocking are customized
within peripheral regions of the gate arrays.

6. The multichip i.c. module of claim 4 wherein
the output shift register(s) and input shift register(s)
are fabricated from logic normally present on the gate
arrays.

7. The multichip i.c. module of claim 1 wherein
the means for loading signals is a parallel load clock.

8. The multichip i.c. module of claim 1 wherein
the module is a production module containing executable
chips.

9. The multichip i.c. module of claim 1 wherein
the multichip module is a prototype module containing at
least some experimental chips.

10. The multichip i.c. module of claim 1 further
comprising software means to partition logic among the
chips, said software means exploiting information
concerning the particular architecture of the module.

11. The multichip i.c. module of claim 1 wherein
said module contains a programmable interconnect chip.

12. In a multichip i.c. module comprising at least
two integrated circuit chips, each chip having a
plurality of I/O pins, said module having inter-i.c.
connection wires interconnecting said pins, a method for
increasing the effective number of I/O pins, said method
comprising the steps of:

time division multiplexing signals at an output
pin of a first chip; and

sending said signals over a connection wire to
an input pin of a second chip.

13. The method of claim 12 wherein all chips are
clocked synchronously.

14. The method of claim 12 wherein said step of
time division multiplexing comprises the substeps of:

parallel loading signals into an output shift
register within a first chip; and

accessing signals in parallel from an input
shift register within a second chip.

15. The method of claim 14 wherein at least some of
the chips are gate arrays from the set comprising FPGA's
and non-field-programmable gate arrays.

16. The method of claim 15 wherein the output shift
register(s) and input shift register(s) are customized
within peripheral regions of said gate arrays.

17. The method of claim 15 wherein the output shift
register(s) and input shift register(s) are fabricated
from logic normally present on said gate arrays.

18. The method of claim 14 wherein the input and
output shift registers comprise stages fabricated from
items from the group of items comprising flip-flops, edge
triggered latches, and pairs of polarity hold latches.

19. The method of claim 18 wherein one of said
stages is reserved for debugging purposes.

20. The method of claim 12 further comprising the
additional step of determining a logic design for the
module by use of partitioning software that exploits
information concerning the particular architecture of the
module.

21. The method of claim 12 further comprising the
step of observing said signals at said output pin for
test purposes.

Description

MULTICHIP IC DESIGN USING TDM

Field of the Invention

This invention pertains to the field of designing
circuits having a plurality of integrated circuit (i.c.)
chips, said design using techniques of TDM (time division
multiplexing).

Description of Background Art

Dobbelaere et al., "Field Programmable MCM Systems
- Design of an Interconnection Frame", IEEE 1992 Custom
Integrated Circuits Conference, pp. 4.6.1-4.6.4,
addresses the same problem addressed by the present
invention (increasing the number of effective
input/output pins in a multichip i.c. module) by
different techniques. The reference teaches programming
a matrix at the corner of each chip to determine the
interconnects among the inputs and outputs. The chips
are FPGA's (field programmable gate arrays). The
reference does not disclose the use of shift registers or
TDM.

"Programmable Interconnect Architecture", Aptix
Corporation Technology Backgrounder, Nov. 1991, pp. 1-14,
and an Aptix press release dated January 1, 1992,
describe use of an areal grid to preserve the ability to
change the interconnects among a set of integrated
circuit chips in applications where logic is being
implemented onto a set of multiple i.c.'s. These
references do not disclose techniques of TDM. Shift
registers are used, but only to determine which i.c.'s
get connected to each other. The present invention uses
shift registers to convey signal information from chip to
chip.

U.S. patent 5,036,473 uses dedicated FPGA's solely
to determine which active FPGA's get connected to which
others. The reference discloses the use of software to
drive and observe signals, but does not disclose the use
of shift registers or TDM.

U.S. patent 5,109,353 discloses an array of
programmable gate elements for emulating electronic
circuits and systems. It does not disclose the use of
shift registers or TDM.

"Handbook of Hardware Modeling", Logic Modeling
Systems Incorporated, February 1992, describes the use of
software to drive and observe signals in a multichip
module. Techniques disclosed in this reference can be
used in conjunction with the invention described herein.

"Computer-Aided Prototyping", Quickturn Systems,
Inc., 1991, pp. 1-4, discloses partitioning logic for a
set of interconnected FPGA's and providing software
stimulus to capture and observe the response. There is
no disclosure of shift registers or TDM. This technology
is further described in "FastForward", LSI Logic,
September 1991, and "MARS Product Overview", PiE Design
Systems, Inc. Similar technology is described in three
press releases by InCA Integrated Circuit Applications:
"InCA 'Virtual ASIC Emulation System supports Xilinx
4000 family FPGAS", June 8, 1992; "Virtual ASIC,
Automatic ASIC Emulation from InCA", 1991; and "Concept
Silicon partitions your design onto multiple FPGAs".

Disclosure of Invention

The present invention is a multichip integrated
circuit module (4) comprising at least two integrated
circuit chips (1). The first chip (1) has at least one
output shift register (9). The second chip (1) has at
least one input shift register (7). Interconnections
(19) couple the output shift register(s) (9) and the input
shift register(s) (7). Means (15) are provided for
loading data in parallel to the output shift register(s)
(9). Means (17) are provided for sequentially shifting
data through the output shift register(s) (9) over the
interconnections (19) and into the input shift
register(s) (7).

Brief Description of the Drawings

These and other more detailed and specific objects
and features of the present invention are more fully
disclosed in the following specification, reference being
had to the accompanying drawings, in which:

Figure 1 is a sketch of an embodiment of the present
invention in which an integrated circuit chip 1 uses at
least one input shift register 7 and one output shift
register 9.

Figure 2 is a sketch showing how the techniques of
the present invention reduce the number of
interconnection wires 19 among chips 1.

Figure 3 is a sketch of an embodiment of the present
invention in which the number of stages 24 in an input
shift register 7 and output shift register 9 can be
reduced by one.

Figure 4 is a sketch of a chip 1 using tri-state
output drivers 21.

Figure 5 is a sketch of an embodiment of the present
invention in which two output shift registers 9 are used
in conjunction with a single output driver 21.

Figure 6 is a sketch showing a plurality of
tri-state output drivers 21, each having a common gating
signal 23.

Figure 7 is a sketch showing how a single output
shift register 9 can be used when the output drivers 21
have a common gating signal 23.

Figure 8 is a sketch showing the use of
bi-directional pins and a plurality of tri-state output
drivers 21 having a common gating signal 23.

Figure 9 is a sketch showing how a single
bi-directional pin 25 can be used when a plurality of
tri-state output drivers 21 have a common gating signal
23.

Figure 10 shows a chip 1 having a plurality of
bi-directional pins 25, each coupled to an output driver
21 that has a different gating signal.

Figure 11 shows a single pin 25 equivalent to the
embodiment depicted in Figure 10.

Figure 12 shows an embodiment of the present
invention in which asynchronous logic 18 is employed.

Figure 13 shows an embodiment of the present
invention in which a plurality of output shift registers
9 are daisy chained together.

Figure 14 shows an embodiment of the present
invention in which a plurality of test shift registers 12
are multiplexed together.

Figure 15 shows an embodiment of the present
invention in which the TDM sequence must be applied twice.

Figure 16 shows an embodiment of the present
invention in which an interconnect 19 couples two chips 1.

Figure 17 shows how the interconnect 19 of Figure 16
can be reconfigured by reprogramming the first chip 1(1).

Figure 18 shows a single interconnect 19 coupling
four chips 1.

Figure 19 shows how the interconnect 19 of Figure 18
can be reprogrammed by reprogramming the source chip 1(1).

Figure 20 shows an embodiment of the present
invention in which an interconnect wire 19 couples two
chips 1.

Figure 21 shows how the interconnect 19 of Figure 20
can be reprogrammed by adding a new signal S3 to the
source chip 1(1).

Figure 22 shows an embodiment of the present
invention in which the number of stages 24 in an input
shift register 7 can be reduced by one.

Detailed Description of the Preferred Embodiments

A major purpose of this invention is to increase the
effective number of input/output (I/O) pins 3,5 on
integrated circuit chips 1 within a module 4 that
comprises a plurality of said chips 1. The invention can
be thought of as creating a number of virtual I/O pins
3,5 that is greater than the number of actual pins 3,5.
The invention also eases pressures on the system designer
when the number of interconnect wires 19 on the module 4
is limited. Further, the invention enhances the
reconfigurability of signal flow across chips 1.

An example where the invention is useful is in the
design of a system 4 in which there is a need to fit a
given amount of logic, which may be provided in the form
of a netlist, into one or more of the chips 1. Normally,
said chips 1 come with a fixed number of I/O pins 3,5
that are used for interconnections 19 among the chips 1.
The limited number of I/O pins 3,5 can greatly reduce the
efficiency of utilizing the chips 1, and in some cases
make the task of fitting the logic into the chips 1
extremely difficult.
The module 4 can be a production module, in which
the chips 1 are executable and application-ready.
Alternatively, module 4 can be a prototype module, in
which the chips 1 are experimented with by the designer
to create a system design. Changes in such a prototyping
environment are typically done by a combination of
hardware and software changes. Alternatively, module 4
can contain some production chips 1 and some programmable
chips 1.

Chips 1 are any chips for which the user has control
over the contents, such as FPGA's (field programmable
gate arrays), non-field-programmable gate arrays, custom
i.c.'s, semi-custom i.c.'s (application specific
integrated circuits), and standard cell i.c.'s. FPGA's
are normally preferable, because of their flexibility.

The present invention makes use of techniques of TDM
(time division multiplexing or time domain
multiplexing). The execution of the system embodied in
module 4 takes longer because of this, but in many
applications this is of no concern. The TDM process is
transparent to the operation of the logic embodied in
module 4.

Typically the TDM is implemented by shift registers
7,9, as illustrated in Figure 1. Each input pin 3 and
each output pin 5 on a chip 1 is assigned a dedicated
shift register 7,9, respectively. The number of stages
24 in a shift register 7,9 is referred to as N. N is any
integer greater than or equal to 2, and is typically
between 2 and 5. Preferably, N is the same for
all shift registers 7,9 on a chip 1. If N were not the
same, a separate shift clock 17 would be needed for each
different value of N. In the chip 1 illustrated in
Figure 1, the value of N is the same, and therefore there
is but one shift clock 17.

Each chip 1 can have a plurality of input pins 3 and
a different number of output pins 5. The different chips
1 on a module 4 can have different numbers of input pins
3 and output pins 5.

The individual stages 24 of the shift registers 7,9
can be, for example, flip-flops, edge triggered latches,
and pairs of polarity hold latches. When polarity hold
latches are used, a pair of shift clocks 17 is required
for use with the corresponding shift register 7,9.

The shift register 9 attached to an output pin 5 is
referred to as an output shift register (OSR) 9, and
functions as N virtual output pins 5. N internal signals
from within the chip 1 can be loaded into the OSR 9 in
parallel by means of activating a parallel load clock 15
associated with the OSR 9. These signals are then
serially shifted out of the OSR 9 over the corresponding
output pin 5 using the shift clock 17 associated with
said OSR 9. The signals travel over interconnection
wires 19 to other chips 1 that need to receive the
signals.

Similarly, the shift register 7 attached to an input
pin 3 is referred to as an input shift register (ISR) 7,
and functions as N virtual input pins 3. It can receive
serially N signals from a board interconnect 19 through
the associated input pin 3 by means of the shift clock 17
that is connected to said ISR 7. The received signals
can then simultaneously be applied inside the chip 1
using their stored states within the stages of the ISR
7. No special clock is needed to unload the signals from
the ISR 7, because once these signals are in the ISR 7,
they are visible to the logic within the chip 1.

Preferably, a single parallel load clock 15 and a
single shift clock 17 are used for all the ISR's 7 and
OSR's 9 on the board 4.

Preferably, the OSR's 9, ISR's 7, and attendant
parallel load and shift clocks 15, 17 are customized
within peripheral regions of the gate arrays 1.
Alternatively, the OSR's 9 and ISR's 7 are fabricated
from logic normally present on the gate arrays 1.

Figure 2 illustrates how the invention minimizes the
number of interconnect wires 19 among chips 1 and
minimizes the number of I/O pins 3,5 within a chip 1.
Figure 2 illustrates the interconnections of an OSR 9 within
a source chip 1 and three ISR's 7 residing within three
target (sink) chips 1. The target chips 1 could be
identical or different in a hardware sense. A single
interconnect wire 19 couples the chips 1. Three signals
S1, S2, and S3 from the source chip 1 are conveyed to the
three sink chips 1. If the TDM and shift register
technique were not used, three interconnects 19 would be
required. The interconnect wire 19 can be thought of as
a TDM bus which bundles in the time domain the three
signals S1, S2 and S3. Said TDM bus 19 is visible to all
three sink chips 1.

As illustrated in Figure 2, not all of the three
signals need to be used in each sink chip 1. (The
signals within the sink chips 1 are primed for notational
purposes.)

The TDM process described herein is transparent to
the intended logic design. This transparency can be
achieved in different ways, depending upon the degree of
transparency needed. For example, if all chips 1 on the
module 4 are synchronously clocked (which is preferable),
the following two-step TDM process can be used, at a safe
time after the application of each pulse from the system
clock (not illustrated). The safe time is that amount of
time needed for all of the signals in the logic to
achieve a steady state, i.e., when the intended design
has been programmed into the chips 1.

Step 1. The parallel load clock 15 (which is
preferably a single clock applied to all output shift
registers 9 on the board 4) is applied to effect a
parallel capture of signals into all of the OSR's 9 on
all the chips 1.

Step 2. Using the shift clock 17 (preferably a
single clock used by all the ISR's 7 and OSR's 9 on all
the chips 1), the contents of all of the OSR's 9 are
shifted into the target ISR's 7 via the interconnects
19. This shifting process involves N applications of
shift clock 17 to all of the chips 1.
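
The two steps above can be sketched in software. The
following Python model is only an illustration and is not
part of the disclosed hardware. The class names
OutputShiftRegister and InputShiftRegister, the function
tdm_transfer, and the choice to back-fill the OSR with
zeros while it shifts are all assumptions made for the
sketch. It models one OSR 9 driving a single interconnect
19 that is visible to several ISR's 7, as in Figure 2.

```python
from collections import deque


class OutputShiftRegister:
    """Hypothetical model of an N-stage OSR 9."""

    def __init__(self, n):
        self.stages = deque([0] * n)

    def parallel_load(self, signals):
        # Step 1: the parallel load clock captures N internal signals.
        assert len(signals) == len(self.stages)
        self.stages = deque(signals)

    def pin(self):
        # The output pin shows the stage at the head of the register.
        return self.stages[0]

    def shift(self):
        # One shift clock pulse: drop the head, back-fill with 0 (assumed).
        self.stages.popleft()
        self.stages.append(0)


class InputShiftRegister:
    """Hypothetical model of an N-stage ISR 7."""

    def __init__(self, n):
        self.stages = [0] * n

    def shift(self, pin_value):
        # One shift clock pulse: sample the input pin into the register.
        self.stages = self.stages[1:] + [pin_value]


def tdm_transfer(osr, isrs, signals):
    """Apply Step 1, then Step 2 (N shift clock pulses) on one wire."""
    osr.parallel_load(signals)
    for _ in range(len(signals)):
        wire = osr.pin()          # value currently on the interconnect
        for isr in isrs:
            isr.shift(wire)       # every sink samples the same wire
        osr.shift()
    return [isr.stages for isr in isrs]


osr = OutputShiftRegister(3)
sinks = [InputShiftRegister(3) for _ in range(3)]
received = tdm_transfer(osr, sinks, [1, 0, 1])  # S1, S2, S3
```

With N equal to 3, three shift clock pulses leave every
sink ISR holding S1, S2 and S3 in order; each sink chip is
then free to use only the stages it needs.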

Exceptions to the above scheme exist for various
reasons, such as: (1) Certain I/O pins are
bi-directional pins 25 (see Figure 8). Such
bi-directional pins 25 could be left unmodified. Signals
involving such pins 25 are not affected by the above two
steps. (2) Inputs 3 that receive direct clock signals
(as opposed to data signals) do not require the use of an
ISR 7. Such would be undesirable, because it would cause
the clock to jiggle. Similarly, output pins 5 that carry
clock signals going to other chips 1 do not require the
use of OSR's 9. Clock signals cannot be transferred
across chips 1 using TDM without significant impact to
the intended logic. As such, it is not desirable to use
such techniques for I/O pins 3,5 that carry clock signals.

Figure 12 shows that if the logic being prototyped
is asynchronous, an additional latch 29 is needed for
each ISR 7 stage 24 that drives a piece of asynchronous
logic 18. Such a latch 29 is a voltage level or logic
level sensitive latch that receives its data from an ISR
7 stage output and is clocked by yet another clock called
the P clock 27. P clock 27 is shared among all chips 1
that have asynchronous logic. The function of this added
latch 29 is to screen the shifting of the ISR 7 from the
asynchronous logic 18. P clock 27 is pulsed once after
the completion of the two usual TDM steps, described
earlier, that are used to effectuate the transfer of
signals across chip 1 boundaries using TDM. As shown in
Figure 12, some of the stages 24 of the ISR 7 can drive
synchronous logic 16, and others of the stages 24 can
drive asynchronous logic 18.
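
The screening role of latch 29 can be illustrated with a
small Python sketch. It is a simplified model under stated
assumptions: the function names shift_in and
receive_with_p_clock are hypothetical, the ISR starts out
cleared, and the latch is modeled as a single stored bit
that changes only when the P clock is pulsed. The sketch
records what asynchronous logic would see with and without
the latch while N bits are shifted in.

```python
def shift_in(stages, bit):
    # One shift clock pulse on an ISR modeled as a list of stages.
    return stages[1:] + [bit]


def receive_with_p_clock(serial_bits, stage_index, latch_q):
    """Shift serial_bits into an ISR and pulse the P clock once at the end.

    Returns (raw, screened, new_latch_q) where raw is the watched stage's
    value after each shift and screened is the latch output at that time.
    """
    stages = [0] * len(serial_bits)
    raw, screened = [], []
    for bit in serial_bits:
        stages = shift_in(stages, bit)
        raw.append(stages[stage_index])   # unscreened view of the stage
        screened.append(latch_q)          # latch holds its old value
    latch_q = stages[stage_index]         # single P clock pulse after TDM
    return raw, screened, latch_q


raw, screened, q = receive_with_p_clock([1, 0, 1], stage_index=2, latch_q=0)
```

In this run the watched stage toggles through 1, 0, 1 while
the bits pass through it, whereas the latch output stays at
its previous value until the P clock pulse delivers the
final, settled value.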

As illustrated in Figures 13 and 14, multiple OSR's
9 can be connected into one daisy-chained composite "test
shift register" (TSR) 12 on each chip 1 to facilitate the
observation of the captured signals externally by
shifting out the TSR 12. All of the observation can be
performed at the output of pin 5(M). These observed
signals can be used for debugging the prototype hardware,
depending upon the availability of extra board 4 logic
and pins 3,5. The contents of the TSR 12 can then be
reloaded from the output 5(M) of the TSR 12 to the shift
register data input 3 on the same chip 1 if it is desired
to continue with the operation of the hardware. A
preferred way of accomplishing this reloading is to make
a connection on the chip 1 itself from output 5(M) to
input 3, thereby creating a circular shift register.
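
The circular reloading described above can be sketched as
follows. This is an illustrative Python model only; the
function name observe_and_restore and the list
representation of the TSR 12 (last element at output
5(M)) are assumptions. Shifting the register by its full
length while feeding the output back to the input exposes
every bit at the output and leaves the contents unchanged.

```python
def observe_and_restore(tsr):
    """Shift a circular TSR by its full length, recording the output bits."""
    observed = []
    for _ in range(len(tsr)):
        bit = tsr[-1]             # value visible at output 5(M)
        observed.append(bit)
        tsr = [bit] + tsr[:-1]    # shift by one, output fed back to input
    return observed, tsr


original = [1, 0, 0, 1, 1]
observed, restored = observe_and_restore(original)
```

After the full rotation, restored equals the original
contents, so operation of the hardware can continue, and
observed lists the bits in the order they left the output.
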

The set of TSR's 12 can be multiplexed to be
observable at the output 10 of the board 4 as illustrated
in Figure 14. Multiplexer select signals 14 control the
selection of the individual TSR 12 outputs by multiplexer
8. The input to all the TSR's 12 is injected via board
input 6. The scheme depicted in Figure 14 is useful when
there are more chip output pins 5 that the designer wants
to observe than there are available board output pins 10.

Since the OSR's 9 are usable for debugging purposes,
the designer can choose to provide extra, initially
unused, stages 24 within some OSR's 9. These stages 24
are then available to be used for observing internal
signals that the hardware designer may not have thought
about earlier, for example, by changing the programming
on the chip 1 to look at these stages 24.

It is possible to reconfigure the interconnects 19
among the chips 1 by sending a different set of signals
on each interconnect 19 than was previously intended.
When combined with the internal reconfigurability of the
FPGA 1 itself, a powerful new type of reconfigurability
is thereby created. Figure 14 shows a module 4 with a
number of FPGA's 1 and set up for using TDM. To achieve
a reprogrammable board 4, one must be able to reprogram
both the logic on the chips 1 and the interconnects 19.
The on-chip logic is already reprogrammable through the
use of the FPGA's 1.
To reprogram the interconnects 19, an FPGA 1 can be used
as a programmable interconnect chip as in U.S. patent
5,036,473, cited above; or a programmable interconnect
chip as described in the Aptix references cited above can
be used. Alternatively, several programmable
interconnect chips 1 may be used. A limited
reprogrammability of interconnects is described in
Figures 16-21.

Figure 16 shows three signals, S1, S2, S3, TDM'ed
between chips 1(1) and 1(2). By reprogramming the first
FPGA 1(1), the manner in which the three signals S1, S2,
S3 connect can be changed, as shown in Figure 17. In
effect, the connections 19 between the two chips 1 have
been rewired.

Figure 18 shows four chips 1 with signals S1, S2, S3
connected as shown. By reprogramming the first chip
1(1), as shown in Figure 19, signal S3 crossing between
chips 1(1) and 1(4) is changed to S4.

In Figure 20, two interconnected chips 1 are shown.
Figure 21 shows how a signal can be added to the
interconnect 19 just by reprogramming the individual chip
1(1) and not touching the interconnect 19.

The flexibility described herein results from chip 1
being able to reprogram a number of signals that are
TDM'ed across to other chips 1. Traditional techniques
limit the reprogrammability to a single signal.

Once the value N is fixed for a given system 4, the
total number of signals that can be moved across the chip
1 boundaries is predetermined. The total number of
signals movable is N multiplied by the number of unique
interconnect wires 19 on the board 4. It is apparent
that by increasing N, a capacity greater than what is
initially needed for the total number of signals can be
achieved. The additional capability is usable for
greater flexibility in reconfiguring the interconnects 19.

Theoretically, it is possible to accomplish all
interconnections between any two chips 1 by using two
interconnect wires 19 between them: one to carry signals
flowing in one direction, and the second to carry signals
flowing in the opposite direction. In this extreme
situation of maximum multiplexing, the shift registers
7,9 must have an N greater than or equal to the larger of
the number of the input signals and the number of the
output signals on any single chip 1. (One signal is
associated with each stage 24 of a shift register 7,9.)
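
The two sizing rules above reduce to simple arithmetic,
shown here as a Python sketch. The function names
tdm_capacity and min_stages_for_two_wires and the example
numbers are hypothetical and chosen only for illustration.

```python
def tdm_capacity(n_stages, n_wires):
    # Total signals movable across chip boundaries: N times the wire count.
    return n_stages * n_wires


def min_stages_for_two_wires(n_inputs, n_outputs):
    # Maximum multiplexing (one wire per direction): N must cover the
    # larger of the input and output signal counts of a chip.
    return max(n_inputs, n_outputs)


# Example figures (assumed): 10 wires, 35 signals needed.
capacity = tdm_capacity(n_stages=4, n_wires=10)
spare = capacity - 35                     # headroom for reconfiguration
n_min = min_stages_for_two_wires(n_inputs=12, n_outputs=7)
```

In this assumed example, raising N to 4 on ten wires gives
room for 40 signals, leaving 5 spare slots that could be
used when reconfiguring the interconnects 19.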

The techniques of the present invention can be
selectively applied to areas of logic that are considered
susceptible to change or require use of TDM to
accommodate the number of signals that need to traverse
across chip 1 boundaries. As such, ISR's 7 and OSR's 9
need not be placed on all qualifying I/O pins 3,5.
Instead, if there are insufficient resources on the chip
1, the ISR's and OSR's 7,9 may be deployed only on enough
I/O pins 3,5 to allow adequate signal flow through the
chips 1. It should be noted that an interconnect 19 must
have either ISR's 7 and OSR's 9 on all terminals of that
interconnect 19, or else must have no ISR's 7 or OSR's 9
at all.

When the chips 1 are FPGA's, it is possible to make
engineering changes to prototype hardware 4 by
reprogramming the affected FPGA's 1 to (1) change the
logic realized by certain FPGA's 1 and/or (2) modify the
effective interconnects 19 by capturing different signals
into one or more of the OSR's 9 and/or (3) change the
signals that are made observable by capturing them into
unused stages 24 of OSR's 9.

Figure 22 shows an optimized version of the scheme
in Figure 2. Here, the contents of the OSR 9 can be made
observable, while reducing the number of stages 24 in an
ISR 7 by one. In this embodiment, the state of the last
stage 24 of an OSR 9 is used as a substitute for the
eliminated stage 24 of an ISR 7.
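
One way to picture the Figure 22 arrangement is the Python
sketch below. It rests on assumptions that the text does
not spell out: that N-1 shift clock pulses are applied,
that the OSR stage driving the pin is the "last stage"
referred to above, and that the OSR back-fills with zeros.
The function name transfer_with_short_isr is hypothetical.

```python
def transfer_with_short_isr(signals):
    """Move N signals using an N-stage OSR and an (N-1)-stage ISR."""
    n = len(signals)
    osr = list(signals)            # after parallel load; osr[0] drives the wire
    isr = [0] * (n - 1)            # one stage fewer than the OSR
    for _ in range(n - 1):         # assumed: N-1 shift clock pulses
        isr = isr[1:] + [osr[0]]   # ISR samples the wire
        osr = osr[1:] + [0]        # OSR advances by one stage
    wire = osr[0]                  # OSR output stage still holds the last signal
    return isr + [wire]            # sink uses the ISR stages plus the wire state


seen_by_sink = transfer_with_short_isr([1, 1, 0])
```

Under these assumptions the sink chip recovers all N
signals: N-1 from its ISR stages and the last one directly
from the state of the interconnect.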

In certain prototyping situations, it may not be
necessary to observe the contents of the OSR's 9. In
these cases, the number of stages 24 needed in an ISR 7
and OSR 9 can be reduced by one, corresponding to the
stage 24 the system designer does not need to observe.
This reduction is achieved by using the scheme depicted
in Figure 3. Figure 3 shows how N equals 2 can provide
the capability to multiplex three signals across a single
interconnect 19. In this embodiment, the final state of
the interconnect 19 after application of the TDM sequence
is used as one of the multiplexed signals for the sink
chip 1(2). In contrast, the scheme depicted in Figure 2
does not have the sink chips 1 dependent upon the final
state of the interconnect 19, but only on the states of
the ISR's 7.

The scheme depicted in Figure 1 uses ISR's 7 and
OSR's 9 to multiplex signals across chip 1 boundaries.
Such a TDM sequence serves to transfer signals once
across an interconnect 19. However, if there is a logic
path that is not interrupted by a latch or other
sequential logic (other than ISR's 7 and OSR's 9) and
spans more than two FPGA chips 1, then the TDM sequence
needs to be repeated. The total number of TDM sequence
applications required is equal to the number of distinct
board level interconnect wires 19 that are included in
the logic path. Figure 15 shows a logic path of
combinational logic 22 that spans two board level
interconnect wires 19(1) and 19(2). There are no latches
other than ISR's 7 and OSR's 9 in the path. As such, the
TDM sequence needs to be applied twice to move the signal
through the three chips 1. Notice that ISR's 7 and OSR's
9 force the use of TDM. Without ISR's 7 and OSR's 9, the
signal from chip 1(1) to chip 1(2) to chip 1(3) would
flow without any clocking.
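
The need to repeat the TDM sequence can be illustrated
with a short Python sketch of the three-chip path of
Figure 15. The sketch is an assumption-laden
simplification: each wire carries a single bit of
interest, the function name apply_tdm_sequence is
hypothetical, and the combinational logic 22 on the middle
chip is taken to be a simple inverter purely for
illustration.

```python
def apply_tdm_sequence(source, isrs, logics):
    """One TDM sequence applied to every wire of a chain of chips.

    isrs[i] is the ISR value on the chip after wire i; logics[i] is the
    combinational function between isrs[i] and the OSR feeding wire i+1.
    """
    # Step 1: all OSRs capture at once, so downstream logic still sees
    # the ISR values left by the previous sequence.
    osrs = [source] + [f(v) for f, v in zip(logics, isrs[:-1])]
    # Step 2: shifting moves each OSR's contents into the next chip's ISR.
    return osrs


invert = lambda v: 1 - v                 # assumed logic on chip 1(2)
isrs = [0, 0]                            # ISR's on chips 1(2) and 1(3)
after_one = apply_tdm_sequence(1, isrs, [invert])
after_two = apply_tdm_sequence(1, after_one, [invert])
```

After one sequence only chip 1(2) holds the new value;
chip 1(3) still holds a result computed from stale data.
After the second sequence chip 1(3) holds the correct
inverted value, matching the rule of one sequence per
board level wire in the path.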

The techniques of the present invention can be
extended to bi-directional pins 25 and/or three state
output drivers 21. (See Figure 4.) Such a driver 21
typically has outputs of logical zero, logical one, and
high impedance.

Buses 19 generally employ tri-state drivers 21 to
allow multiple source chips 1 onto a single wire. Such
buses 19 are intended by the designer to achieve
efficient interconnects between a multitude of source
chips 1 and some number of sink chips 1. Generally, one
source chip output driver 21 is active while others are
in a high impedance state, so that one source chip output
driver 21 is driving all receivers in the bus 19. The
architecture shown in Figure 1 cannot be employed
directly in such a case. Figure 4 shows N output pins 5,
with a tri-state driver 21 on each pin 5. The OSR 9
scheme of Figure 1 will not work here, because the high
impedance state cannot be transmitted via an OSR 9 and an
ISR 7. Instead, the scheme shown in Figure 5 needs to be
employed. It should be noted that this scheme uses two
OSR's 9 instead of one: the first OSR 9(1) to capture
the N gating signals and the second OSR 9(2) to capture
the N data signals. Output driver 21 sees the
corresponding gate-data combination as the OSR's 9 are
shifted out. Thus, output pin 5 sees the gated output
from output driver 21. The generated TDM sequence of
signals on output pin 5 combines with similarly generated
TDM sequences from other output pins 5 that are connected
with this first output pin 5.
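
The two-OSR arrangement of Figure 5 can be sketched in
Python as follows. This is a behavioral illustration only:
the names Z, driver, tdm_tristate and resolve_bus are
hypothetical, high impedance is modeled as the string "Z",
and raising an error on contention is a modeling choice
rather than something the text specifies.

```python
Z = "Z"  # stand-in for the high impedance state


def driver(gate, data):
    # Tri-state output driver 21: pass data when gated on, else float.
    return data if gate else Z


def tdm_tristate(gates, data):
    """Shift the gate OSR 9(1) and data OSR 9(2) out in lockstep."""
    return [driver(g, d) for g, d in zip(gates, data)]


def resolve_bus(*pin_sequences):
    """Combine the TDM sequences of several pins tied to one wire."""
    resolved = []
    for slot in zip(*pin_sequences):
        driven = [v for v in slot if v != Z]
        if len(driven) > 1:
            raise ValueError("bus contention in a time slot")
        resolved.append(driven[0] if driven else Z)
    return resolved


chip_a = tdm_tristate(gates=[1, 0, 0], data=[1, 1, 0])
chip_b = tdm_tristate(gates=[0, 1, 0], data=[0, 0, 1])
bus = resolve_bus(chip_a, chip_b)
```

In each time slot at most one driver is active, so the
wire carries the active chip's data for that slot and
floats when no gate is asserted.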

Figure 6 shows N output pins 5 with high impedance
drivers 21 that share a common output gating signal 23.
Signal 23 either produces a high impedance on the outputs
of drivers 21 or else transmits the input signals to the
outputs. Figure 7 shows a corresponding single pin
architecture that avoids the use of two OSR's 9 but still
places output driver 21 between the single OSR 9 and the
output pin 5. The use of two OSR's 9 is avoided because
of the presence of the common gating signal 23.

Figure 8 shows a typical N bi-directional pin 25
configuration with a common output gating signal 23. In
Figure 8, the same pin 25 is used both to output a signal
from the chip 1 and to receive signals from the
outside. To achieve a single pin 25 architecture for TDM
of the N signals, the scheme of Figure 9 is employed. In
Figure 9, the architecture of Figure 7 is used to feed
the outputs of the N drivers 21, and a single ISR 7 is
used for the N input signals as in the Figure 1
embodiment.

Figure 10 shows N bi-directional pins 25, each with
a different gating output signal. The single-pin 25
architecture equivalent for TDM of the N signals uses the
scheme shown in Figure 11. In Figure 11, the
architecture of Figure 5 is employed for the N output
drivers 21, and a single ISR 7 is applied to the N input
signals as in the Figure 1 embodiment.

Personalization instructions are injected into
module 4, e.g., by techniques described in U.S. patent
5,109,353 cited above. More than one logic design can be
introduced into the module 4 by different personalization
instructions. Software that is intended to provide
automatic or interactive partitioning of the intended
system design, such as Concept Silicon from InCA cited
above, can exploit the knowledge of the existence of
ISR's 7 and OSR's 9 to help pack more logic into each of
the programmable chips 1 and/or to minimize the number of
interconnection wires 19.

The software is executed on a computer external to
module 4, e.g., a workstation. The software can be
standalone (not physically coupled to module 4) software
that exploits the hardware architecture of the module 4
as introduced by the invention. Alternatively, there can
be physical coupling between the software and the module
4, e.g., the computer on which the software is executed
can have an electrical connection to the module 4, over
which the personalization created by the software is
downloaded into the module 4.

The benefit of minimizing the number of chips 1 is
obvious. The benefit of minimizing the number of
interconnects 19 is significant. If the prototype board
4 is intended to utilize Aptix-type programmable
interconnect chips, minimizing the number of
interconnects 19 will reduce the number of Aptix
components needed to achieve programmable interconnects.
Either the software is told the number of virtual I/O's
3,5 rather than the actual number; or else the software
is told all about the ISR's 7 and OSR's 9 so that it will
take this information into account when it does the
partitioning. The programmable interconnect chip does
not always need to be changed. Rather, ISR's 7 and/or
OSR's 9 are placed on the chips 1 to which the
programmable interconnect chip is connected, thereby
enhancing the programmable interconnect chip.
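
As a rough illustration of the virtual I/O count that might be reported to the partitioning software, the following sketch assumes that every multiplexed pin is backed by a shift register with the same number of stages and that each stage carries one signal. The function names and the uniform-depth assumption are hypothetical, and pins consumed by shift or load clocks are ignored.

```python
def virtual_io_count(physical_pins, stages_per_pin):
    """Number of virtual I/O's if each physical pin carries `stages_per_pin` signals."""
    if physical_pins < 0 or stages_per_pin < 1:
        raise ValueError("need physical_pins >= 0 and stages_per_pin >= 1")
    return physical_pins * stages_per_pin


def wires_needed(signal_count, stages_per_wire):
    """Interconnection wires needed to carry `signal_count` signals under TDM."""
    if signal_count < 0 or stages_per_wire < 1:
        raise ValueError("need signal_count >= 0 and stages_per_wire >= 1")
    return -(-signal_count // stages_per_wire)  # ceiling division


if __name__ == "__main__":
    print(virtual_io_count(100, 8))  # virtual I/O's for 100 pins, 8 stages each
    print(wires_needed(801, 8))      # wires for 801 signals, 8 signals per wire
```

Under these assumptions, a partitioner told the virtual count can assign more signals to each chip boundary than the physical pin count alone would allow.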
Combining the use of the present invention with the
use of one or more programmable interconnect chips can
allow the board 4 designer the flexibility of creating
additional interconnects 19 among chips 1. Programmable
interconnect chips are used to interconnect among two or
more I/O pins 3,5. These I/O pins 3,5 could also have
ISR's 7 and OSR's 9 on board the chip 1, thus providing
a greatly increased number of programmable interconnects.

The configuration of the prototype board 4 is
changed by a combination of hardware and software. For
hardware changes, interconnects 19 are reprogrammed,
FPGA's and/or programmable interconnect chips 1 are
reprogrammed, FPGA's 1 are added or subtracted, and
connections are obliterated using lasers. For software
changes, software other than the partitioning software is
used to, e.g., reprogram the Aptix chip(s) 1.

The above description is included to illustrate the
operation of the preferred embodiments and is not meant
to limit the scope of the invention. The scope of the
invention is to be limited only by the following claims.

From the above discussion, many variations will be 
apparent to one skilled in the art that would yet be 
encompassed by the spirit and scope of the invention. 
What is claimed is: 
Claims

1. A multichip i.c. module comprising:

at least two integrated circuit chips, a first
chip having at least one output shift register and a
second chip having at least one input shift register;

interconnections coupling the output shift
register(s) and the input shift register(s);

means for loading signals in parallel to the
output shift register(s); and

means for sequentially shifting signals through
the output shift register(s) over the interconnections
and into the input shift register(s).

2. The multichip i.c. module of claim 1 wherein
all chips are clocked synchronously.

3. The multichip i.c. module of claim 1 wherein at
least one chip contains asynchronous logic, said chip
comprising an input shift register that is coupled to
said asynchronous logic via a polarity hold latch;
wherein:

said polarity hold latch is clocked by a P clock.

4. The multichip i.c. module of claim 1 wherein at
least some of the chips are gate arrays from the set
comprising FPGA's and non-field-programmable gate arrays.

5. The multichip i.c. module of claim 4 wherein
the output shift register(s), the input shift
register(s), and associated clocking are customized
within peripheral regions of the gate arrays.

6. The multichip i.c. module of claim 4 wherein
the output shift register(s) and input shift register(s)
are fabricated from logic normally present on the gate
arrays.

7. The multichip i.c. module of claim 1 wherein
the means for loading signals is a parallel load clock.

8. The multichip i.c. module of claim 1 wherein
the module is a production module containing executable
chips.

9. The multichip i.c. module of claim 1 wherein
the multichip module is a prototype module containing at
least some experimental chips.

10. The multichip i.c. module of claim 1 further
comprising software means to partition logic among the
chips, said software means exploiting information
concerning the particular architecture of the module.

11. The multichip i.c. module of claim 1 wherein
said module contains a programmable interconnect chip.

12. In a multichip i.c. module comprising at least
two integrated circuit chips, each chip having a
plurality of I/O pins, said module having inter-i.c.
connection wires interconnecting said pins, a method for
increasing the effective number of I/O pins, said method
comprising the steps of:

time division multiplexing signals at an output 
pin of a first chip; and 

sending said signals over a connection wire to 
an input pin of a second chip. 

13. The method of claim 12 wherein all chips are
clocked synchronously.

14. The method of claim 12 wherein said step of 
time division multiplexing comprises the substeps of: 

parallel loading signals into an output shift
register within a first chip; and

accessing signals in parallel from an input 
shift register within a second chip. 

15. The method of claim 14 wherein at least some of
the chips are gate arrays from the set comprising FPGA's
and non-field-programmable gate arrays.

16. The method of claim 15 wherein the output shift
register(s) and input shift register(s) are customized
within peripheral regions of said gate arrays.

17. The method of claim 15 wherein the output shift
register(s) and input shift register(s) are fabricated
from logic normally present on said gate arrays.

18. The method of claim 14 wherein the input and 
output shift registers comprise stages fabricated from 
items from the group of items comprising flip-flops, edge
triggered latches, and pairs of polarity hold latches.

19. The method of claim 18 wherein one of said
stages is reserved for debugging purposes.

20. The method of claim 12 further comprising the
additional step of determining a logic design for the
module by use of partitioning software that exploits
information concerning the particular architecture of the
module.

21. The method of claim 12 further comprising the
step of observing said signals at said output pin for
test purposes.